SUMMARY

This status report briefly describes the research carried out by faculty,
staff, and students of the M.I.T. Laboratory for Information and Decision
Systems under ONR Contract N00014-77-C-0224. The period covered is
October 1, 1980 through September 30, 1981.

The  scope  of  this  contract  is  the  development  of  an  overall  failure 
detection  system  design  methodology  and  of  methods  for  fault-tolerant 
control. In the following sections we overview the research that has been
performed in these areas during the indicated time period. We have also
included a list of the papers and reports that have been and are being
written as a result of research performed under this contract. In addition,
during the period mentioned above, Prof. Alan S. Willsky, principal
investigator for this contract, visited the People's Republic of China and Japan.
A trip report was submitted to the Mathematics Program (Code 432), and a
copy of that report is included as Appendix A to this status report.
I. Robust Comparison Signals for Failure Detection

As  discussed  in  the  preceding  progress  report  [19]  for  this  project, 
a  key  problem  is  the  development  of  methods  for  generating  comparison 
signals  that  can  be  used  reliably  to  detect  failures  given  the  presence 
of  system  parameter  uncertainties.  Previously  we  have  made  some  initial 
progress  in  this  area,  as  is  described  in  [19]  and  in  more  detail  in  the 
Ph.D. dissertation of E.Y. Chow [8] and in the paper [16]. In this work
we used the idea of redundancy relations, which are simply algebraic
relationships among system outputs at several times. Using this concept,
we  proposed  an  analytic  method  for  determining  the  parameters  defining  a 
comparison  signal  that  is  optimum  in  the  sense  of  minimizing  the  influence 
of  parameter  uncertainties  on  the  value  of  the  signal  under  normal  operation. 

This research represented a significant step in increasing our
understanding of robust failure detection and in developing a useful,
complete methodology. There were, however, several key limitations to
this earlier work. Specifically,

(a)  No  algorithmic  method  existed  for  identifying  and  constructing 
all  possible  redundancy  relations  for  a  given  system. 

(b) No method existed for constructing the set of redundancy
relations which can be used to detect a given failure.
More generally, no method existed for finding all sets
of redundancy relations which allow one to detect each
of a set of specified failures and to distinguish among
them.

(c) The optimization formulation developed is complex and its
use for systems of moderate size seemed prohibitive. Also,
the method leads to a choice of comparison signals that
depends upon the system state and input. While this may be
appropriate in some problems, it is not in many others.
Furthermore, the formulation dealt primarily with minimizing
the influence of parameter uncertainties on comparison signals
under normal operation. No single, satisfactory formulation
existed for incorporating the performance of the particular
signal choice under both unfailed and failed system conditions.

(d) No coherent picture existed for describing the full range of
possible methods for using a particular redundancy relationship
and for quantitatively relating performance as measured by the
optimization criterion to an actual failure detection algorithm
based on the redundancy relation.

During this past year we have initiated a new, related research
project aimed at developing algebraic and geometric approaches to overcoming
these limitations. We have identified and begun to develop an extremely
promising approach. Our work to date will be described in some detail
in the forthcoming S.M. thesis proposal of Mr. Xi-Chang Lou [20]. In
this section we will briefly outline the main ideas.

Consider a linear system of the form

$$x(k+1) = A x(k) \tag{1.1}$$

$$y(k) = C x(k) \tag{1.2}$$


For simplicity in our initial discussion here and in the first part of
our research we will not include inputs (and hence will focus on sensor
failures and not on actuator failures). Let $y_p(k)$ denote an extended
observation vector formed from $p+1$ consecutive outputs:

$$y_p'(k) = [\,y'(k),\ y'(k+1),\ \ldots,\ y'(k+p)\,] \tag{1.3}$$


Any vector

$$a_p' = [\,a_{p0}',\ a_{p1}',\ \ldots,\ a_{pp}'\,]$$

which satisfies

$$a_p'\, y_p(k) = \sum_{i=0}^{p} a_{pi}'\, y(k+i) = 0 \tag{1.4}$$

for all $k \ge 0$ and all possible $x(0)$ is called a parity vector of length p,
and (1.4) is called a parity check of length p.
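As an illustration of the definition, the following minimal sketch checks a candidate parity vector both algebraically and along a simulated trajectory. It relies on the observation that y(k+i) = C A^i x(k), so (1.4) holds for every initial state exactly when the sum of the terms a_pi' C A^i vanishes. The two-state system and the function names are assumptions chosen for the example, not quantities taken from this report.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_parity_vector(A, C, a_blocks):
    """True when sum_i a_i' C A^i = 0, so that (1.4) holds for every x(0)."""
    n = len(A)
    total = [0] * n
    CAi = [row[:] for row in C]              # starts at C A^0
    for a_i in a_blocks:
        for j in range(n):
            total[j] += sum(a_i[r] * CAi[r][j] for r in range(len(C)))
        CAi = mat_mul(CAi, A)                # advance to C A^(i+1)
    return all(t == 0 for t in total)

def simulate_outputs(A, C, x0, steps):
    """Return y(0), ..., y(steps-1) for x(k+1) = A x(k), y(k) = C x(k)."""
    x, ys = list(x0), []
    for _ in range(steps):
        ys.append([sum(c * xi for c, xi in zip(row, x)) for row in C])
        x = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return ys

def parity_residual(a_blocks, window):
    """Evaluate sum_i a_i' y(k+i) on one window of p+1 consecutive outputs."""
    return sum(a * y for a_i, y_i in zip(a_blocks, window)
               for a, y in zip(a_i, y_i))

# Hypothetical example: position/velocity state with a position sensor.
A = [[1, 1], [0, 1]]
C = [[1, 0]]
a_blocks = [[1], [-2], [1]]                  # y(k) - 2 y(k+1) + y(k+2)

if __name__ == "__main__":
    ys = simulate_outputs(A, C, [3, -2], 6)
    print(is_parity_vector(A, C, a_blocks))
    print([parity_residual(a_blocks, ys[k:k + 3]) for k in range(4)])
```

Integer arithmetic is used so that the zero test is exact; with measured data one would compare the residual against a threshold instead.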

A  first  key  problem  is  to  identify  all  possible  parity  vectors  and 
to  develop  algorithms  for  generating  them.  The  key  to  this  is  the 
following:  define  the  vector  of  polynomials 
$$p(z) = \sum_{i=0}^{p} a_{pi}\, z^i \tag{1.5}$$

Then $a_p$ is a parity vector if and only if there is an $n \times 1$ vector of
polynomials $q(z)$ ($n$ is the dimension of $x$) so that

$$p'(z)\, C\, (zI - A)^{-1} = q'(z) \tag{1.6}$$

or, equivalently, if and only if

$$[\,p'(z),\ -q'(z)\,]$$

is in the left null space of the matrix

$$\begin{bmatrix} C \\ zI - A \end{bmatrix}$$

The importance of this result is that the last characterization identifies
the set of parity relations with the left nullspace of a particular
polynomial matrix, and, in fact, this allows us to use some of the powerful
tools of the algebraic theory of linear systems to construct all possible
redundancy relations and, in fact, to find a basis consisting of parity
checks of minimal length. As length directly corresponds to the amount of
memory involved in a parity check, one intuitively would prefer short
checks, in order to minimize the effects of parameter uncertainties.
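For a fixed length p, the parity vectors can also be found numerically, since y(k+i) = C A^i x(k) implies that they are exactly the left null vectors of the matrix obtained by stacking C, CA, ..., CA^p. The sketch below computes such a basis by exact Gauss-Jordan elimination. It is a brute-force illustration under that assumption and does not attempt the minimal-length basis construction from the algebraic theory mentioned above; the example system is hypothetical.

```python
from fractions import Fraction

def null_space(M):
    """Basis of {v : M v = 0}, computed by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    ncols = len(rows[0])
    pivots, r = [], 0
    for c in range(ncols):
        pr = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pr is None:
            continue                          # no pivot: column c is free
        rows[r], rows[pr] = rows[pr], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in range(ncols):
        if free in pivots:
            continue
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][free]      # back-substitute pivot variables
        basis.append(v)
    return basis

def stacked_observation_matrix(A, C, p):
    """Rows of C, CA, ..., CA^p stacked into one matrix."""
    n = len(A)
    blocks, CAi = [], [row[:] for row in C]
    for _ in range(p + 1):
        blocks.extend(CAi)
        CAi = [[sum(row[k] * A[k][j] for k in range(n)) for j in range(n)]
               for row in CAi]
    return blocks

def parity_vectors(A, C, p):
    """Basis of all parity vectors of length p for x(k+1)=Ax(k), y(k)=Cx(k)."""
    H = stacked_observation_matrix(A, C, p)
    H_t = [list(col) for col in zip(*H)]      # left null space of H = null space of H'
    return null_space(H_t)

if __name__ == "__main__":
    A = [[1, 1], [0, 1]]
    C = [[1, 0]]
    for p in range(3):
        print(p, parity_vectors(A, C, p))
```

For this example no parity check exists for p = 0 or p = 1, and a single one appears at p = 2, which is consistent with the preference for the shortest available checks.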

Work  is  presently  continuing  in  developing  algorithms  for  constructing 
parity  checks  and  for  finding  parity  vectors  that  are  useful  for  particular 
failure  modes.  Specifically,  suppose  that  in  addition  to  the  normal  operation 
model  (1.1),  (1.2)  we  also  have  a  set  of  possible  failure  models 


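$$x(k+1) = A_i x(k) \tag{1.7}$$

$$y(k) = C_i x(k) \tag{1.8}$$

$i = 1, \ldots, N$. Suppose that a vector $a_p$ is a valid parity vector, i.e. there
is a $q(z)$ so that $[\,p'(z),\ -q'(z)\,]$ is in the left nullspace of

$$\begin{bmatrix} C \\ zI - A \end{bmatrix} \tag{1.9}$$

Suppose also that there is no polynomial $q_i(z)$ so that $[\,p'(z),\ -q_i'(z)\,]$ is
in the left nullspace of

$$\begin{bmatrix} C_i \\ zI - A_i \end{bmatrix} \tag{1.10}$$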


In  this  case  the  parity  check  (1.4)  will  give  a  value  of  zero  if  there  is 
no  failure  but  will  generally  give  a  nonzero  value  if  failure  mode  i 
occurs.  Clearly  then  what  we  wish  to  identify  are  the  intersections  of 
the left nullspaces of the matrices in (1.9) and (1.10). As discussed in
[20] this can also be used to determine sets of parity checks which can
distinguish among a set of failures. Work is continuing on obtaining
algorithmic solutions.
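The detectability condition can be illustrated with a small sketch. For a model (A, C) the parity check output equals the row vector formed by summing a_pi' C A^i, multiplied by x(k); the check is valid for the normal model when this row vector is zero, and it responds to failure mode i when the corresponding row vector for (A_i, C_i) is nonzero. The two-sensor system and both failure models below are hypothetical and serve only to show one detectable and one undetectable mode.

```python
def parity_gain(A, C, a_blocks):
    """Row vector sum_i a_i' C A^i; the check output equals this times x(k)."""
    n = len(A)
    gain = [0] * n
    CAi = [row[:] for row in C]
    for a_i in a_blocks:
        for j in range(n):
            gain[j] += sum(a_i[r] * CAi[r][j] for r in range(len(CAi)))
        CAi = [[sum(row[k] * A[k][j] for k in range(n)) for j in range(n)]
               for row in CAi]
    return gain

def detectable_modes(a_blocks, normal, failure_models):
    """Indices i (from 1) of failure models that give a nonzero check output."""
    A, C = normal
    if any(g != 0 for g in parity_gain(A, C, a_blocks)):
        raise ValueError("not a parity vector for the normal model")
    return [i for i, (A_i, C_i) in enumerate(failure_models, start=1)
            if any(g != 0 for g in parity_gain(A_i, C_i, a_blocks))]

# Hypothetical example: two redundant sensors measuring the first state.
A = [[1, 1], [0, 1]]
C = [[1, 0], [1, 0]]
a_blocks = [[1, -1]]                       # length-0 check: y1(k) - y2(k)

failure_models = [
    (A, [[1, 0], [2, 0]]),                 # mode 1: gain change in sensor 2
    (A, [[2, 0], [2, 0]]),                 # mode 2: identical gain change in both
]

if __name__ == "__main__":
    print(detectable_modes(a_blocks, (A, C), failure_models))
```

Mode 2 leaves the check at zero because the vector remains a parity vector for the failed model, which is the situation excluded by the condition on (1.10).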

The research described above is aimed directly at several of the
limitations mentioned earlier. Using the results of this research we have
also initiated research of a more geometric nature that is aimed at
overcoming the remaining limitations. Specifically, it can be seen that the set
of all parity checks of order $\le p$ is equivalent to the orthogonal
projection in $\mathbb{R}^{m(p+1)}$ ($m = \dim(y)$) onto the orthogonal complement of the
range of the matrix

$$\begin{bmatrix} C \\ CA \\ \vdots \\ CA^p \end{bmatrix} \tag{1.11}$$

For example, if $y' = (y_1, y_2)$ and $y_2 = a y_1$, then the geometric picture is
as is illustrated in Figure 1.1.

In terms of this perspective, parameter uncertainties manifest themselves
as perturbations in the range of the matrix (1.11). For our
example, if $a_{\min} \le a \le a_{\max}$, we have the picture depicted in Figure 1.2.

For  this  example  it  intuitively  makes  sense  to  use  as  a  parity  check  the 
projection  onto  a  line  which  is  "as  orthogonal  as  possible"  to  the  cone 
of  possible  observation  subspaces.  One  logical  criterion  is  to  choose  a 
line  which  makes  the  largest  possible  angle  with  the  cone  —  i.e.  which 
maximizes  the  smallest  angle  of  the  chosen  line  with  any  line  in  the  cone. 
The  idea  just  described  can  be  extended  to  the  general  case,  and  the 
optimization problem can be stated in terms of singular values of a particular
matrix. Also, this approach can be viewed as a modification of that
of  Chow  in  that  it  overcomes  the  state-dependent  nature  of  the  optimum 
parity  check  of  Chow.  Furthermore,  this  geometric  approach  can  also  be  used 
to  formulate  problems  which  allow  one  to  choose  the  optimum  parity  checks 
subject  not  only  to  performance  constraints  under  normal  operation  but  also 
when  specific  failures  occur. 
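For the two-dimensional example the max-min angle criterion has a simple closed form, sketched below: the cone of observation lines y2 = a y1 occupies the angles between atan(a_min) and atan(a_max), and the line that maximizes its smallest angle with the cone is the one orthogonal to the cone's bisector. The function names and the sampled check are assumptions for illustration; the general case stated in terms of singular values is not attempted here.

```python
import math

def line_angle(u, v):
    """Angle in [0, pi/2] between the lines spanned by the 2-vectors u and v."""
    dot = abs(u[0] * v[0] + u[1] * v[1])
    cosine = dot / (math.hypot(*u) * math.hypot(*v))
    return math.acos(min(1.0, cosine))

def robust_parity_direction(a_min, a_max):
    """Unit vector orthogonal to the bisector of the cone {y2 = a y1}."""
    mid = 0.5 * (math.atan(a_min) + math.atan(a_max))
    return (-math.sin(mid), math.cos(mid))

def worst_case_angle(direction, a_min, a_max, samples=201):
    """Smallest angle between `direction` and sampled lines in the cone."""
    slopes = [a_min + (a_max - a_min) * i / (samples - 1)
              for i in range(samples)]
    return min(line_angle(direction, (1.0, a)) for a in slopes)

if __name__ == "__main__":
    a_min, a_max = 0.5, 2.0
    robust = robust_parity_direction(a_min, a_max)
    nominal = (-a_min, 1.0)        # exactly orthogonal to the line a = a_min only
    print(math.degrees(worst_case_angle(robust, a_min, a_max)))
    print(math.degrees(worst_case_angle(nominal, a_min, a_max)))
```

A check designed for a single nominal value of a is exactly orthogonal to one line in the cone but can come much closer to the others, which is the effect the criterion is meant to limit.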

To illustrate the point mentioned at the end of the preceding
paragraph, consider our simple example and suppose that a failure results in
a shift in a. When there are uncertainties in a, this results in a picture
as illustrated in Figure 1.3. Intuitively, we would like to use a parity
check consisting of the orthogonal projection onto a line which makes a
large angle with lines in the unfailed cone and a small angle with lines
in the failed cone. We have obtained a "Neyman-Pearson-like" optimization
formulation for this problem and are presently studying the algorithmic
solution of this problem and the formulation and solution of problems of
distinguishing among a set of possible failures.




II.  Fault-Tolerant  Control  Systems 

In the preceding progress report [19] we outlined several classes of
discrete-time stochastic control problems that are aimed at providing a
framework for gaining an understanding of fault-tolerant optimal control.
These problems involve a finite-state jump process p denoting the operational
status of the system. The system state x evolves according to a linear
stochastic equation parametrized by the finite-state process. During the
past year significant progress has been made on the problems described
in [19]. These results will be described in detail in the forthcoming
Ph.D. thesis of Mr. H.J. Chizeck [18]. Specifically, we have accomplished
the following:

(1) As mentioned in [19], the problem is straightforward when
p is independent of x. However, the qualitative properties
of the solution and of the closed-loop system are surprisingly
complex, and a wide variety of types of behavior can be
obtained. We have now derived a series of results and constructed
a set of examples which allow us to understand the possibilities
in more detail.

(2) When the transition probabilities of p depend on x the problem
becomes one of nonlinear stochastic control. This problem
reveals many of the critical properties of fault-tolerant
systems, including hedging and risk-avoidance. In much of our
work in this area we have focussed on the scalar problem where
the dependence of p on x is piecewise-constant. A cursory
glance at this problem indicates that with this formulation
the problem can be solved (via dynamic programming) by
examining a number of constrained linear-quadratic problems
that grows as we go back in time.

The problem has, however, significantly more structure,
which we have now characterized. This characterization has
allowed us to pinpoint the nature of hedging and risk-avoidance
for these systems, to reduce the computational complexity of
the solution by a substantial amount, and to obtain a finite
look-ahead approximation.

(3) We have also completed an investigation of the problem described
in (2) above in the presence of bounded process noise. In
this case the piecewise-quadratic nature of the solution of
(2) is lost in some regions, but the insight from (2) allows
us to obtain an approximation to the cost-to-go which reduces
the problem to one much like that without process noise.

(4) We have also obtained some initial results for the vector version
of the problem of (2). In this case the situation becomes far
more complex, as the regions into which one must divide the
state space at each stage of the algorithm have complex shapes.
Work is continuing on obtaining approximation methods for these
regions much as we did for the costs-to-go in (3).
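To make the simplest case, item (1), concrete, the following sketch runs the dynamic programming recursion for a scalar jump-linear-quadratic problem whose mode transitions do not depend on x. Everything specific here is an assumption made for illustration rather than the formulation of [18] or [19]: the dynamics x(k+1) = a_i x(k) + b_i u(k) in mode i, the stage cost q x^2 + r u^2, a perfectly observed mode, and no process noise. Under these assumptions the cost-to-go in mode i is K(i) x^2, and the K(i) satisfy a set of coupled Riccati-type recursions.

```python
def jump_lq_costs(a, b, P, q, r, horizon, terminal=0.0):
    """Backward recursion for cost-to-go coefficients K(i) and feedback gains.

    Mode i has dynamics x(k+1) = a[i] x(k) + b[i] u(k); P[i][j] is the
    probability of moving from mode i to mode j, independent of x.
    The control in mode i is u = -gain[i] * x.
    """
    n = len(a)
    K = [terminal] * n
    gains = [0.0] * n
    for _ in range(horizon):
        # Expected next-stage coefficient seen from each current mode.
        S = [sum(P[i][j] * K[j] for j in range(n)) for i in range(n)]
        gains = [a[i] * b[i] * S[i] / (r + b[i] ** 2 * S[i]) for i in range(n)]
        K = [q + a[i] ** 2 * S[i] - a[i] * b[i] * S[i] * gains[i]
             for i in range(n)]
    return K, gains

if __name__ == "__main__":
    # Hypothetical two-mode example: mode 0 is normal, mode 1 is a
    # degraded actuator that is never repaired.
    a = [1.0, 1.0]
    b = [1.0, 0.5]
    P = [[0.9, 0.1],
         [0.0, 1.0]]
    K, gains = jump_lq_costs(a, b, P, q=1.0, r=1.0, horizon=200)
    print(K, gains)
```

With a single mode the recursion reduces to the ordinary scalar Riccati equation. In the two-mode example the cost coefficient for normal operation is larger than it would be if failures were impossible, reflecting the anticipated cost of a possible future failure.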

In addition to these problems we have also made progress on a
fault-tolerant optimal control problem in which we have noisy observations of the state.
Specifically, we have been examining a problem in which a system may switch
from normal operation to a failed condition and where our controller must
decide if and when to switch from a control law optimal for normal
operation (with a criterion specific to normal operation) to one optimal under
failed conditions (perhaps with a different criterion). This is a novel
but exceedingly important sequential decision problem. Specifically,
standard statistical decision problems are aimed at providing a tradeoff
between incorrect decision probabilities and decision delay. For control
problems, these are only indirect performance indicators - e.g. the effect
of a false alarm depends on the performance loss resulting from switching
from the normal control law and the effect of detection delay depends on
the performance loss from using the normal law after the system has failed.
At this point we have obtained the form of the solution, but much work
remains in developing algorithms and in understanding the nature of the
solution.
III. Additional Problems in Detection

During  the  past  year  we  have  continued  and  initiated  work  along 
several  directions.  Brief  descriptions  follow: 

(1) Decision rules. The work of Chow [8, 17, 19] describes an
algorithm for computing optimum decision rules.
This algorithm was complex computationally, and extensions to
more involved detection problems using this approach are
prohibitively complex. The reason for this is that optimum
algorithms attempt to partition the space of possible
conditional probability vectors for the given set of hypotheses into
decision regions. The boundaries of these regions are the points
where two decisions yield exactly equal performance. It is our
opinion that most of the computational complexity is due to
this goal of finding the precise boundaries, which involves
obtaining precise statistical predictions of the evolution of
the conditional probabilities under each hypothesis. We have
recently initiated the investigation of suboptimum algorithms
based on approximate descriptions of the evolution of conditional
probabilities. This formulation offers the possibility of
solving far larger problems at reduced computational cost and
with small and perhaps negligible performance loss. These
possibilities remain to be examined.

(2) Complex decision problems. As discussed in [19] there is an
exceedingly large and rich class of problems that involve
continuous processes coupled together with discrete processes whose
transitions represent events in the observed signals or the
underlying systems. The methods we have developed and are
developing for failure detection represent in some sense a
first step in attacking the simplest problems of this type,
i.e., ones in which we must detect isolated and sporadic
events. We have also initiated investigations of problems in
which we wish to detect and identify sequences of events. Such
problems are of significance for the reliable control of large
scale systems and, in our opinion, hold the key for solving many
complex signal processing problems. In the preceding progress
report [19] we outlined a generic problem formulation for
event-driven signal generation. During this past year we have built on
this formulation to develop a structure for signal processing
algorithms for event-driven signals. The building blocks for
these algorithms are specialized detection algorithms of the
type one uses for failure detection, and the key problem is one
of developing decision mechanisms based on the outputs of these
simple algorithms. As discussed in [19], the major issue is
one of pruning the tree of possible sequences of events in an
optimum manner. The approximate methods described in (1) above
are potentially of great value for this problem. In addition to
our analytical work, we are also working on several specific
applications. This experience is exceedingly useful in providing
insight into the nature of problems of this type. At this time
we are working on problems of electrocardiogram analysis based
on an event-driven model, efficient edge detection in images,
the detection of objects given remote integral data (which is
of direct application to problems of tomographic tracking of
cold-temperature regions in the ocean), and optimum closed-loop
strategies for searching for objects. The fact that such a wide
variety of problems can be approached essentially from one
unified perspective indicates, we feel, the central importance
of this research effort.

(3) Event-Driven Models for Dynamic Systems. Based on the perspective
in (2), we have initiated a more mathematical aspect of our
research based on the development of simplified event-driven
models for nonlinear systems affected by small amounts of noise
and/or rare events. The motivation for this research is that
the exact analysis of such models or the solutions of problems
of estimation and control for such models may be considerably
more complicated (often these problems are intractable) than
those for simplified models obtained through asymptotic analysis.
As an example, consider the scalar stochastic system described by
the stochastic differential equation

$$dx(t) = f(x(t))\,dt + \epsilon\, dw(t) \tag{3.1}$$

where

$$f(x) = \begin{cases} -(x-1), & x > 0 \\ -(x+1), & x < 0 \end{cases} \tag{3.2}$$


This system is characterized by the property that for time intervals
that are small the process behaves like a linear process
near one equilibrium or another, while for long times the
aggregate process $\mathrm{sgn}(x(t))$ converges (as $\epsilon \to 0$) to a Markov
jump process. Consequently, one might expect that estimation
of $x(t)$ based on measurements of the form

$$dy(t) = x(t)\,dt + dv(t) \tag{3.3}$$

might  be  accomplished  based  on  viewing  the  process  as  the 
state  of  an  event-driven  linear  system.  More  generally,  one  can 
consider  analogous  models  for  other  nonlinear  systems  possessing 
multiple  equilibria  and  subject  to  small  noise.  We  already  have 
some  results  along  the  lines  indicated  for  simple  examples,  and 
we  are  continuing  to  investigate  more  general  situations.  Note 
that the estimation algorithms that result are of precisely the
form considered in (2). It is our feeling that this research
direction  represents  a  very  promising  approach  to  obtaining  a 
substantial  extension  to  the  class  of  estimation  problems  for 
which  tractable  solutions  can  be  found. 
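The two-time-scale behavior of (3.1)-(3.2) can be explored with a simple Euler-Maruyama simulation, sketched below. The step size, horizon, noise levels, and the choice f(0) = 0 (which (3.2) leaves undefined) are assumptions for illustration. For small noise intensity the path stays near one equilibrium and sgn(x(t)) changes only rarely, while larger noise produces frequent switching.

```python
import math
import random

def drift(x):
    """f(x) from (3.2); the value at x = 0 is an assumption (taken as 0)."""
    if x > 0:
        return -(x - 1.0)
    if x < 0:
        return -(x + 1.0)
    return 0.0

def simulate(x0, eps, dt, steps, rng):
    """Euler-Maruyama sample path of dx = f(x) dt + eps dw."""
    x, xs = x0, [x0]
    for _ in range(steps):
        x += drift(x) * dt + eps * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs

def count_sign_changes(xs):
    """Number of transitions of the aggregate process sgn(x) along the path."""
    signs = [1 if x > 0 else -1 for x in xs if x != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

if __name__ == "__main__":
    rng = random.Random(0)
    for eps in (0.1, 0.5, 1.0):
        path = simulate(1.0, eps, dt=0.01, steps=20000, rng=rng)
        print(eps, count_sign_changes(path))
```

The sign-change count is a crude proxy for the jump rate of the limiting Markov process, since a discretized path can cross zero several times during a single transition.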
This report summarizes the trip of Prof. Alan S. Willsky to the
People's Republic of China and Japan. The primary purposes of this trip
were to participate in the Bilateral Seminar on Control Systems held in
Shanghai, China and the Seventh Triennial World Congress (held in Kyoto,
Japan) of the International Federation of Automatic Control. The following
is the itinerary followed by Prof. Willsky:

August  9-12  Shanghai,  China 

August  13-16  Xian,  China 

August  16-19  Beijing,  China 

August  19-22  Tokyo,  Japan 

August  22-29  Kyoto,  Japan 

Prof.  Willsky  served  as  technical  program  chairman  for  the  meeting 
in  Shanghai  and  as  one  of  the  organizers  of  the  activities  of  the 
official  IEEE  Control  Systems  Society  delegation  during  the  entire  visit 
to China. In addition, Prof. Willsky was one of three plenary speakers
during the Bilateral Seminar. The subject of his talk was an introduction
to and survey of methods for the detection of abrupt changes in signals
and systems. Prof. Willsky's research in this field has been and is
presently supported in part by ONR.

The  basic  purpose  of  the  visit  by  the  IEEE  delegation  was  to  establish 
ties  between  the  Control  Systems  Society  and  the  Chinese  Association  of 
Automation  and  to  provide  an  opportunity  for  discussion  among  researchers 
from  both  organizations.  To  achieve  these  objectives,  the  delegation 
organizers structured the visit to allow for ample opportunity for
discussion and for members of the IEEE delegation to gain knowledge and
understanding about China, the Chinese people, and research in China. In addition
to the 3-day meeting in Shanghai, there were also visits to Xian and Beijing.
Cultural,  social,  and  technical  activities  were  organized  in  both  of  these 
cities.  In  Xian  the  delegation  visited  Xian  Jiaotong  University,  and 
Prof.  Willsky  was  involved  in  a  discussion  of  implementation  issues  for 
digital  control  systems.  Also  involved  in  this  discussion  was  Dr.  Stuart  L. 
Brodsky  of  ONR.  In  Beijing  a  technical  interchange  was  held  at  The 
Great  Hall  of  the  People. 

Overall  the  visit  to  China  was  exceedingly  worthwhile.  The  meeting 
in  Shanghai  was  a  significant  success,  and  the  contacts  made  there  will 
allow for continued interaction. In particular, a number of Chinese
researchers expressed great interest in Prof. Willsky's lecture and provided
him  with  information  and  publications  concerning  research  on  failure 
detection  and  adaptive  control  in  China.  The  visit  to  Xian  Jiaotong 
University  was  also  valuable,  as  it  provided  the  opportunity  to  see  one 
of  China's  leading  and  fastest  growing  technical  universities.  Beyond 
these  specific  scheduled  events  the  many  informal,  unscheduled  discussions 
at  banquets  provided  further  information  about  research  at  institutions 
that  were  not  visited. 

The other major portion of this trip was the IFAC World Congress,
the largest (approximately 1500 attendees) meeting of researchers in
automatic control. In addition to presenting a paper on implementation issues
in digital control and attending various technical sessions, Prof. Willsky
also had the opportunity to discuss research topics with researchers from
many countries. In particular, Prof. Willsky engaged in numerous discussions
on problems of abrupt changes, failure detection and fault-tolerant control.
Prof. Willsky spoke with Prof. K. Åström of Sweden, Prof. L. Ljung of Sweden,
Dr. F. Pau of France, Prof. V. Utkin of the Soviet Union, and Prof. A. Halme
of Finland, among others. These discussions were of great value in updating
Prof. Willsky's knowledge of related research around the world. In addition,
Prof. Willsky also was able to learn much about the status and direction of
robotics research in Japan. As this represents an important and promising
direction for future research, the opportunity provided by this visit to
Japan was a significant one.
Abstract


The formulation of the decision making of a failure detection
process as a Bayes sequential decision problem (BSDP) provides
a simple conceptualization of the decision rule design problem.
As the optimal Bayes rule is not computable, a methodology that
is based on the Bayesian approach and aimed at a reduced
computational requirement is developed for designing suboptimal rules.
A numerical algorithm is constructed to facilitate the design and
performance evaluation of these suboptimal rules. The result of
applying this design methodology to an example shows that this
approach is a useful one.


* This work was supported in part by the Office of Naval Research under Contract
No. N00014-77-C-0224 and in part by NASA Ames Research Center under Grant NGL-22-009-124.


1. INTRODUCTION

The failure detection and identification (FDI)
process involves monitoring the sensor measurements
or processed measurements known as the residual [1]
for changes from its normal (no-fail) behavior.
Residual samples are observed in sequence. If a failure
is judged to have occurred and sufficient information
(from the residual) has been gathered, the monitoring
process is stopped. Then, based on the past
observations of the residual, an identification of the failure
is made. If no failure has occurred, or if the
information gathered is insufficient, monitoring is not
interrupted so that further residual samples may be
observed. The decision to interrupt the
residual-monitoring to make a failure identification is based
on a compromise between the speed and accuracy of the
detection, and the failure identification reflects
the design tradeoff among the errors in failure
classification. Such a decision mechanism belongs to the
extensively studied class of sequential tests or
sequential decision rules. In this paper, we will
employ the Bayesian Approach [2] to design decision
rules for FDI systems.
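The monitor-then-stop structure described above can be illustrated with a minimal sketch of one simple sequential rule: track the posterior probability that a failure has already occurred and stop the first time it crosses a threshold. This is a generic posterior-threshold rule shown only to make the mechanism concrete; it is not the suboptimal rule design developed in this paper. The Gaussian residual model, the mean shift `shift`, the per-step failure probability `rho`, and the threshold are all assumptions.

```python
import math

def gaussian_likelihood_ratio(r, shift, sigma):
    """f1(r)/f0(r) for N(shift, sigma^2) against N(0, sigma^2)."""
    return math.exp((shift * r - 0.5 * shift ** 2) / sigma ** 2)

def monitor(residuals, rho, shift, sigma, threshold):
    """Observe residual samples in sequence and stop at the first alarm.

    Returns (stop_time, posterior); stop_time is None when monitoring
    was never interrupted. Times are counted from 1.
    """
    post = 0.0        # P(failure has occurred | residuals observed so far)
    for k, r in enumerate(residuals, start=1):
        # The failure may have started at time k with probability rho.
        prior = post + (1.0 - post) * rho
        lr = gaussian_likelihood_ratio(r, shift, sigma)
        post = prior * lr / (prior * lr + (1.0 - prior))
        if post >= threshold:
            return k, post
    return None, post

if __name__ == "__main__":
    # Hypothetical residual sequence: no-fail behavior for 20 samples,
    # then a mean shift of 2 caused by a failure.
    residuals = [0.0] * 20 + [2.0] * 20
    print(monitor(residuals, rho=0.01, shift=2.0, sigma=1.0, threshold=0.95))
```

Raising the threshold lowers the chance of a false alarm at the price of a longer detection delay, which is the speed-versus-accuracy compromise described in the text.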

In Section 2, we will describe the Bayes
formulation of the FDI decision problem. Although the
optimal rule is generally not computable, the
structure of the Bayesian approach can be used to derive
practical suboptimal rules. We will discuss the
design of suboptimal rules based on the Bayes
formulation in Section 3. In Section 4, we will report our
experience with this approach to designing decision
rules through a numerical example and simulation.


2. THE BAYESIAN APPROACH

The BSDP formulation of the FDI problem consists
of six elements:

1) $\Theta$: the set of states of nature or failure
hypotheses. An element $\theta$ of $\Theta$ may denote a single
type $i$ failure of size $\nu$ occurring at time $\tau$
($\theta = (i,\tau,\nu)$) or the occurrence of a set of failures
(possibly simultaneously), i.e.
$\theta = \{(i_1,\tau_1,\nu_1), \ldots, (i_n,\tau_n,\nu_n)\}$.
Due to the infrequent nature of failure, we
will focus on the case of a single failure.

In many applications it suffices to just identify
the failure type without estimating the failure size.
Moreover, it is often true that a detection system
based on $(i,\tau,\bar\nu)$ for some appropriate $\bar\nu$ can also
detect and identify the type of the failure $(i,\tau,\nu)$ for
$\nu \ge \bar\nu$. Thus, we may use $(i,\tau,\bar\nu)$ to represent $(i,\tau)$.
In the aircraft sensor FDI problem [3], for instance,
excellent results were obtained using this approach.
Now we have the discrete nature set

$$\Theta = \{(i,\tau),\ i = 1, \ldots, M,\ \tau = 1, 2, \ldots\}$$

where we assume there are M different failure types
of interest.

2) $\mu$: the prior probability mass function (PMF)
over the nature set $\Theta$. This PMF represents the a
priori information concerning the failure, i.e. how
likely it is for each type of failure to occur, and
when a failure is likely to occur. Because this
information may not be available or accurate in some
cases, the need to specify $\mu$ is a drawback of the
Bayes approach for such cases. Nevertheless, we will
see that it can be regarded as a parameter in the
design of a Bayes rule.

In general, $\mu$ may be arbitrary. Here, we assume
the underlying failure process has two properties:
i) the M failures of $\Theta$ are independent of one another,
and ii) the occurrence of each failure $i$ is a
Bernoulli process with (success) parameter $\rho_i$. The
Bernoulli process (corresponding to the Poisson
process in continuous time) is a common model for failures
in physical components; the independence assumption
describes a large class of failures (such as sensor
failures) while providing a simple approximation for
the others. It is straightforward to show that

$$\mu(i,\tau) = \sigma(i)\,\rho\,(1-\rho)^{\tau-1}, \qquad i = 1, \ldots, M,\ \ \tau = 1, 2, \ldots$$

where

$$\rho = 1 - \prod_{j=1}^{M} (1-\rho_j)$$

$$\sigma(i) = \rho_i (1-\rho_i)^{-1} \Big[ \sum_{j=1}^{M} \rho_j (1-\rho_j)^{-1} \Big]^{-1}$$

The parameter $\rho$ may be regarded as the parameter of
the combined (Bernoulli) failure process - the
occurrence of the first failure; $\sigma(i)$ can be interpreted
as the marginal probability that the first failure
is of type $i$. Note that the present choice of $\mu$
indicates the arrival of the first failure is
memoryless. This property is useful in obtaining
time-invariant suboptimal decision rules.

3) $\mathcal{D}(k)$: the discrete set of terminal actions
(failure identifications) available to the decision
maker when the residual-monitoring is stopped at time
k. An element $\delta$ of $\mathcal{D}(k)$ may denote the pair (j,t),
i.e. declaration of a type j failure to have occurred
at time t. Alternatively, $\delta$ may represent an
identification of the j-th failure type without regard
for the failure time, or it may signify the presence
of a failure without specification of its type or
time, i.e. simply an alarm. Since the purpose of FDI
is to detect and identify failures that have occurred,
$\mathcal{D}(k)$ should only contain identifications that either
specify failure times at/before k, or do not specify
any failure time. As a result, the number of terminal
decisions specifying failure times grows with
k while the number of decisions not specifying any
time will remain the same. In addition, $\mathcal{D}(k)$ does
not include the declaration of no failure, since the
residual-monitoring is stopped only when a failure
appears to have occurred.

4) $L(k;\theta,\delta)$: the terminal decision cost function
at time k. $L(k;\theta,\delta)$ denotes the penalty for
deciding $\delta \in \mathcal{D}(k)$ at time k when the true state of
nature is $\theta = (i,\tau)$. It is assumed to be bounded and
non-negative and to have the structure:

$$L(k;(i,\tau),\delta) = \begin{cases} L((i,\tau),\delta), & \tau \le k,\ \delta \in \mathcal{D}(k) \\ L_F, & \tau > k,\ \delta \in \mathcal{D}(k) \end{cases}$$

where $L(\theta,\delta)$ is the underlying cost function that is
independent of k; $L_F$ denotes the penalty for a false
alarm, and it may be generalized to be dependent on
$\delta$. It is only meaningful for a terminal action
(identification) that indicates the correct failure
(and/or time) to receive a lower decision cost than
one that indicates the wrong failure (and/or time).

We further assume that the penalty due to an incorrect
identification of the failure time is only dependent
on the error of such an identification. That
is, for $\delta = (j,t)$,

$$L((i,\tau),(j,t)) = L(i,j,(t-\tau))$$

and, for $\delta$ with no time specification,

$$L((i,\tau),\delta) = L(i,\delta).$$

5) $\{r(k)\}$: the m-dimensional residual (observation)
sequence. We shall let $p(r(1),\ldots,r(k) \mid (i,\tau))$
denote its conditional density. Assuming
that the residual is affected by the failure in a
causal manner, its conditional density has the property

$$p(r(1),\ldots,r(k) \mid (i,\tau)) = p(r(1),\ldots,r(k) \mid (0,-)), \qquad i = 1,\ldots,M,\quad \tau > k$$

where $(0,-)$ is used to denote the no-fail condition.
For the design of suboptimal rules, we will assume
that the residual is an independent Gaussian sequence
with V (an m×m matrix) as the time-independent covariance
function and $g_i(k-\tau)$ as the mean given that the failure
$(i,\tau)$ has occurred. With the covariance assumed
to be the same for all failures, the mean function
$g_i(k-\tau)$ characterizes the effect of the failure
$(i,\tau)$, and it is henceforth called the signature of
$(i,\tau)$ (with $g_i(k-\tau) = 0$ for i = 0 or $\tau > k$). We have
chosen to study this type of residual because its
special structure facilitates the development of
insights into the design of decision rules. Moreover,
the Gaussian assumption is reasonable in many problems
and has met with success in a wide variety of
applications, e.g., [3], [4]. (It should be noted that the
use of more general probability densities for the
residual will not add any conceptual difficulty.)

6) $c(k,(i,\tau))$: the delay cost function having
the properties:

$$c(k,(i,\tau)) = \begin{cases} c(i,k-\tau) > 0, & \tau < k \\ 0, & \tau \ge k \end{cases}$$

$$c(i,k_1-\tau) > c(i,k_2-\tau), \qquad k_1 > k_2 > \tau$$

After a failure has occurred at $\tau$, there is a penalty
for delaying the terminal decision until time $k > \tau$,
with the penalty an increasing function of the delay
$(k-\tau)$. In the absence of a failure, no penalty is
imposed on the sampling. In this study we will
consider a delay cost function that is linear in the
delay, i.e. $c(i,k-\tau) = c(i)(k-\tau)$, where c(i) is a
positive function of the failure type i, and may be used
to provide different delay penalties for different
types of failures.
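
The first-failure prior described in item 2) can be tabulated numerically. The following is a minimal sketch, assuming the geometric form $\mu(i,\tau) = \sigma(i)\rho(1-\rho)^{\tau-1}$ with $\rho = 1 - \prod_j (1-\rho_j)$ and $\sigma(i)$ proportional to $\rho_i/(1-\rho_i)$; the function name, the truncation length `tau_max`, and the numerical values of $\rho_i$ are illustrative assumptions, not values from the report.

```python
import numpy as np

def first_failure_prior(rho_i, tau_max):
    """Tabulate mu(i, tau) for tau = 1..tau_max (hypothetical helper).

    rho_i   : per-step Bernoulli failure probabilities, one per failure type
    tau_max : truncation of the (infinite) failure-time axis
    """
    rho_i = np.asarray(rho_i, dtype=float)
    # Parameter of the combined failure process (arrival of the first failure).
    rho = 1.0 - np.prod(1.0 - rho_i)
    # Marginal probability that the first failure is of type i.
    odds = rho_i / (1.0 - rho_i)
    sigma = odds / odds.sum()
    # Geometric (memoryless) distribution of the first failure time.
    taus = np.arange(1, tau_max + 1)
    geometric = rho * (1.0 - rho) ** (taus - 1)
    mu = np.outer(sigma, geometric)  # mu[i-1, tau-1]
    return rho, sigma, mu

# Illustrative values only: two failure types with small per-step probabilities.
rho, sigma, mu = first_failure_prior([0.001, 0.003], tau_max=20000)
```

Because the tail $(1-\rho)^{\tau}$ decays geometrically, the truncated table sums to one up to a negligible remainder, which gives a quick sanity check on the reconstruction of the prior.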

A sequential decision rule naturally consists of
two parts: a stopping rule (or sampling plan) and a
terminal decision rule. The stopping rule, denoted
by $\Phi = (\phi(0), \phi(1;r(1)), \ldots, \phi(k;r(1),\ldots,r(k)), \ldots)$, is a
sequence of functions of the observed residual samples,
with $\phi(k;r(1),\ldots,r(k)) = 1$ or 0. When
$\phi(k;r(1),\ldots,r(k)) = 1$ (0), residual-monitoring or
sampling is stopped (continued) after the k residual
samples $r(1),\ldots,r(k)$ are observed. Alternatively,
the stopping rule may be defined by another sequence
of functions $\Psi = (\psi(0), \psi(1;r(1)), \ldots, \psi(k;r(1),\ldots,r(k)), \ldots)$,
where $\psi(k;r(1),\ldots,r(k)) = 1$ (0) indicates
that residual-monitoring has been carried on up to
and including time (k-1) and will (not) be stopped
after time k when the residual samples $r(1),\ldots,r(k)$ are
observed. The functions $\phi$ and $\psi$ are related to each
other in the following way:

$$\psi(k;r(1),\ldots,r(k)) = \phi(k;r(1),\ldots,r(k)) \prod_{s=0}^{k-1} \big[ 1 - \phi(s;r(1),\ldots,r(s)) \big]$$

with $\psi(0) = \phi(0)$.
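
The relation between the two forms of the stopping rule can be sketched in a few lines. This is an illustrative sketch only: `phi` is assumed to hold the values $\phi(0), \phi(1), \ldots$ already evaluated along one residual trajectory, and the function name is hypothetical.

```python
def psi_from_phi(phi):
    """Convert stop indicators phi(k) into first-stop indicators psi(k).

    psi(k) = phi(k) * prod_{s<k} (1 - phi(s)), with psi(0) = phi(0).
    """
    psi = []
    still_sampling = 1  # running product of (1 - phi(s)) for s < k
    for p in phi:
        psi.append(p * still_sampling)
        still_sampling *= (1 - p)
    return psi

# phi says "stop" at k = 2 and again at k = 4; only the first stop counts.
phi = [0, 0, 1, 0, 1]
psi = psi_from_phi(phi)
```

Along any trajectory at most one $\psi(k)$ equals one, which is why the expected cost can be written as a sum over k weighted by $\psi$.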

The terminal decision rule is a sequence of
functions, $D = (d(0), d(1;r(1)), \ldots, d(k;r(1),\ldots,r(k)), \ldots)$,
mapping the residual samples $r(1),\ldots,r(k)$ into
the terminal action set $\mathcal{D}(k)$. The function
$d(k;r(1),\ldots,r(k))$ represents the decision rule used
to arrive at an action (identification) if sampling
is stopped at time k and the residual samples $r(1),
\ldots,r(k)$ are observed.

As a result of using the sequential decision
rule $(\Phi,D)$, given $(i,\tau)$ is the true state of nature,
the total expected cost is:

$$U[(i,\tau),(\Phi,D)] = E_{i,\tau} \Big\{ \sum_{k=0}^{\infty} \psi(k;r(1),\ldots,r(k)) \big[ c(k,(i,\tau)) + L(k;(i,\tau),d(k;r(1),\ldots,r(k))) \big] \Big\}$$

The BSDP is defined as: determine a sequential decision
rule $(\Phi^*,D^*)$ so that the sequential Bayes risk
$U_S$ is minimized, where

$$U_S(\Phi,D) = E_\mu U[(i,\tau),(\Phi,D)] = \sum_{i=1}^{M} \sum_{\tau=1}^{\infty} \mu(i,\tau)\, U[(i,\tau),(\Phi,D)]$$

$(\Phi^*,D^*)$ is called the Bayes Sequential Decision Rule
(BSDR) with respect to $\mu$, and it is optimal in the
sense that it minimizes the sequential Bayes risk.

In the following we will discuss an interpretation
of the sequential risk for the FDI problem. Let
us define the following notation:

$$P_F(\tau) = \sum_{k=1}^{\tau-1} E_{0,-}\, \psi(k;r(1),\ldots,r(k))$$

$$\mathcal{D} = \bigcup_{k=0}^{\infty} \mathcal{D}(k)$$

$$S(k,\delta) = \{ [r(1),\ldots,r(k)] : \psi(k;r(1),\ldots,r(k)) = 1,\ d(k;r(1),\ldots,r(k)) = \delta \}, \qquad \delta \in \mathcal{D}$$

$$\Pr\{S(k,\delta) \mid i,\tau\} = \int_{S(k,\delta)} p(r(1),\ldots,r(k) \mid i,\tau)\, dr(1) \cdots dr(k)$$

where $P_F(\tau)$ is the probability of stopping to declare
a failure before the failure occurs at $\tau$, i.e. the
probability of false alarm when a failure occurs at
time $\tau$; $\mathcal{D}$ is the set of terminal actions for all times;
$S(k,\delta)$ is the region in the sample space of the first
k residuals where the sequential rule $(\Phi,D)$ yields the
terminal decision $\delta$. Clearly, the $S(k,\delta)$'s are
disjoint sets with respect to both k and $\delta$. The
expressions $\bar{t}(i,\tau)$ and $P((i,\tau),\delta)$ are the conditional
expected delay in decision (i.e. stopping sampling and
making a failure identification) and the conditional
probability of eventually declaring $\delta$, respectively,
given that a type i failure has occurred at time $\tau$ and
no false alarm has been signalled before this time.
$P((i,\tau),\delta)$ is the generalized cross-detection
probability. Finally, the sequential Bayes risk $U_S$ can be
written as

$$U_S(\Phi,D) = \sum_{i=1}^{M} \sum_{\tau=1}^{\infty} \mu(i,\tau) \Big\{ L_F P_F(\tau) + (1 - P_F(\tau)) \Big[ c(i)\,\bar{t}(i,\tau) + \sum_{\delta \in \mathcal{D}} L((i,\tau),\delta)\, P((i,\tau),\delta) \Big] \Big\} \qquad (1)$$

Equation (1) indicates that the sequential Bayes
risk is a weighted combination of the conditional false
alarm probability, expected delay to decision and
cross-detection probabilities, and the optimal sequential
rule $(\Phi^*,D^*)$ minimizes such a combination. From
this vantage point, the cost functions (L and c) and
prior distribution ($\mu$) provide for the weighting,
and hence a basis for indirectly specifying the tradeoff
relationships among the various performance issues.

The advantage of the indirect approach is that only
the total expected cost instead of every individual
performance issue needs to be considered explicitly in
designing a sequential rule. The drawback of the
approach, however, lies in the choice of a set of
appropriate cost functions (and sometimes the prior
distribution) when the physical problem does not have a
natural set, as it doesn't in general. In this case, the
Bayes approach is most useful with the cost functions
(and the prior distribution) considered as design
parameters that may be adjusted to obtain an acceptable
design.

The optimal terminal decision rule $D^*$ can be easily
shown to be a sequence of fixed-sample-size tests
[2]. The determination of the optimal stopping rule
$\Phi^*$ is a dynamic programming problem [1]. The immense
storage and computation required make $\Phi^*$ impossible to
compute, and suboptimal rules must be used.

Despite the impractical nature of its solution,
the BSDP provides a useful framework for designing
suboptimal decision rules for the FDI problem because
of its inherent characteristic of explicitly weighing
the tradeoffs between detection speed and accuracy (in
terms of the cost structure). A sequential decision
rule defines a set of sequential decision regions
$S(k,\delta)$, and the decision regions corresponding to the
BSDR yield the minimum risk. From this vantage point,
the design of a suboptimal rule can be viewed as the
problem of choosing a set of decision regions that
would yield a reasonably small risk. This is the
essence of the approach to suboptimal rule design that
we will describe next.


3. SUBOPTIMAL RULES

The Sliding Window Approximation

The immense computation associated with the BSDR
is partly due to the increasing number of failure
hypotheses as time progresses. The remedy for this
problem is the use of a sliding window to limit the
number of failure hypotheses to be considered at each
time. The assumption made under the sliding window
approximation is that essentially all failures can be
detected within W time steps after they have occurred,
or that if a failure is not detected within this time
it will not be detected in the future. Here, the
window size W is a design parameter, and it should be
chosen long enough so that detection and identification
of failures are possible, but short enough so that
implementation is feasible [1].

The sliding window rule $(\Phi^W,D^W)$ divides the sample
space of the sliding window of residuals $\{r(k-W+1),
\ldots,r(k)\}$, or equivalently, the space of vectors of
posterior probabilities, likelihood ratios, or log
likelihood ratios ($l$) of the sliding window of failure
hypotheses into disjoint time-independent sequential
decision regions $\{S_0,S_1,\ldots,S_N\}$. Because the residuals
are assumed to be Gaussian variables, it is simpler to
work with L (which is related to $l$ by a constant):

$$L(k) = [L_0'(k), \ldots, L_{W-1}'(k)]'$$

where
$L(k) \in S_0$, and we will proceed to take one more
observation of the residual. The Bayes design problem is
to determine a set of regions $\{S_0,S_1,\ldots,S_N\}$ that
minimizes the sequential risk $U^W(\{S_i\})$. This represents
a functional minimization problem for which a solution
is generally very difficult to determine. A simpler
alternative to this problem is to constrain the decision
regions to take on special shapes, $\{S_i(f)\}$, that
are parameterized by a fixed-dimensional vector, f,
of design variables. Then the resulting design problem
involves the determination of a set of parameter
values $f^*$ that minimizes the risk $U^W(f)$. We will
focus our attention on a special set of parametrized
sequential decision regions, because they are simple
and they serve well to illustrate that the Bayes
formulation can be exploited, in a systematic fashion,
to obtain simple suboptimal rules that are capable of
delivering good performance. These decision regions
are:

$$S(j,t) = \{ L(k) : L(k;j,t) > f(j,t),\ \epsilon^{-1}(j,t)[L(k;j,t) - f(j,t)] \ge \epsilon^{-1}(i,\tau)[L(k;i,\tau) - f(i,\tau)],\ (i,\tau) \ne (j,t) \} \qquad (3a)$$

$$S(0,-) = \{ L(k) : L(k;i,\tau) \le f(i,\tau),\ i = 1,\ldots,M,\ \tau = 0,\ldots,W-1 \} \qquad (3b)$$

where S(j,t) is the stop-to-declare (j,k-t) region and
S(0,-) is the continue region (see Fig. 1). Generally
the $\epsilon$'s may be regarded as design parameters, but
here, $\epsilon(j,t)$ is simply taken to be the standard
deviation of L(k;j,t).

To evaluate $U^W(f)$, we need to determine the set
of probabilities,

$$\{ \Pr\{ L(k) \in S(j,t),\ L(k-1) \in S(0,-), \ldots, L(W) \in S(0,-) \mid i,\tau \},\ k \ge W,\ j = 0,1,\ldots,M,\ t = 0,\ldots,W-1 \}$$

which, indeed, is the goal of many research efforts in
the so-called level-crossing problem [5]. Unfortunately,
useful results (bounds and approximations of
such probabilities) are only available for the scalar
case [6], [7], [3]. As it stands, each of the
probabilities is an integral of a kMW-dimensional Gaussian
density over the compound region $S(0,-) \times \cdots \times S(0,-)
\times S(j,t)$, which, for large kMW, becomes extremely
unwieldy and difficult to evaluate.

The MW-dimensional vector of decision statistics
L(k) corresponds to the MW failure hypotheses, and
they provide the information necessary for the
simultaneous identification of both failure type and
failure time. In most applications, such as the aircraft
sensor FDI problem [3] and the detection of freeway
traffic incidents [4], where the failure time need not
be explicitly identified, the failure time resolution
power provided by the full window of decision
statistics is not needed. Instead, decision rules that
employ a few components of L(k) may be used. The
decision rule of this type considered here consists
of sequential decision regions that are similar to
(3) but are only defined in terms of M components of
L(k):

$$S_j = \{ L_{W-1}(k) : L(k;j,W-1) > f_j,\ \epsilon^{-1}(j,W-1)[L(k;j,W-1) - f_j] \ge \epsilon^{-1}(i,W-1)[L(k;i,W-1) - f_i],\ i \ne j \} \qquad (4a)$$

$$S_0 = \{ L_{W-1}(k) : L(k;j,W-1) \le f_j,\ j = 1,\ldots,M \} \qquad (4b)$$

$S_j$ is the stop-to-declare-failure-j region and
$S_0$ is the continue region. It should be noted that
the use of (4) is effective if cross-correlations of
signatures among hypotheses of the same failure type
at different times are smaller than those among
hypotheses of different failure types.
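
The simplified regions (4) amount to a threshold test followed by a scaled comparison. The following is a minimal sketch, assuming the M statistics L(k;j,W-1), the thresholds $f_j$, and positive scale factors (written `eps` here, taken in the text to be standard deviations) are already available as arrays; the function name and the tie-breaking rule (lowest index wins) are assumptions made for illustration.

```python
import numpy as np

def simplified_window_decision(L, f, eps):
    """Return 0 to continue sampling, or j in 1..M to stop and declare failure j.

    L   : current decision statistics, one per failure type
    f   : thresholds, one per failure type
    eps : positive scale factors, one per failure type
    """
    L = np.asarray(L, dtype=float)
    f = np.asarray(f, dtype=float)
    eps = np.asarray(eps, dtype=float)
    excess = L - f
    # Continue region: every statistic is at or below its threshold.
    if np.all(excess <= 0.0):
        return 0
    # Stop region j: largest scaled excess over the threshold.
    return int(np.argmax(excess / eps)) + 1

decision_continue = simplified_window_decision([1.0, 2.0], [3.0, 4.0], [1.0, 1.0])
decision_first = simplified_window_decision([5.0, 4.5], [3.0, 4.0], [1.0, 1.0])
decision_second = simplified_window_decision([5.0, 6.0], [3.0, 4.0], [4.0, 1.0])
```

The third call shows the role of the scale factors: the first statistic exceeds its threshold by the same raw amount as the second, but after scaling the second hypothesis is selected.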

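The risk for using (4) is

$$U^W(f) = L_F \sum_{i=1}^{M} \sum_{\tau=1}^{\infty} \mu(i,\tau) \sum_{k=W}^{\tau-1} \sum_{j=1}^{M} \Pr\{ L_{W-1}(k) \in S_j,\ S_0(k-1) \mid 0,- \}$$
$$\qquad + \sum_{i=1}^{M} \sum_{\tau=1}^{\infty} \mu(i,\tau) \sum_{k=\max[W,\tau]}^{\infty} \sum_{j=1}^{M} [c(i)(k-\tau) + L(i,j)]\, \Pr\{ L_{W-1}(k) \in S_j,\ S_0(k-1) \mid i,\tau \}$$

where

$$S_0(k) = \{ L_{W-1}(k) \in S_0, \ldots, L_{W-1}(W) \in S_0 \}$$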

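The probabilities required for calculating the risk
are given by the recursion:

$$p(L_{W-1}(k+1) \mid S_0(k),i,\tau) = \Big[ \int_{S_0} p(L_{W-1}(k) \mid S_0(k-1),i,\tau)\, dL_{W-1}(k) \Big]^{-1}$$
$$\qquad \times \int_{S_0} p(L_{W-1}(k+1) \mid L_{W-1}(k),S_0(k-1),i,\tau)\, p(L_{W-1}(k) \mid S_0(k-1),i,\tau)\, dL_{W-1}(k), \qquad k \ge W \qquad (5)$$

$$\Pr\{ L_{W-1}(k) \in S_j,\ S_0(k-1) \mid i,\tau \} = \Pr\{ S_0(k-1) \mid i,\tau \} \int_{S_j} p(L_{W-1}(k) \mid S_0(k-1),i,\tau)\, dL_{W-1}(k), \qquad j = 0,1,\ldots,M \qquad (6)$$

$$\Pr\{ L_{W-1}(W) \in S_j \mid i,\tau \} = \int_{S_j} p(L_{W-1}(W) \mid i,\tau)\, dL_{W-1}(W) \qquad (7)$$

For M small, numerical integration of (5)-(7) becomes
manageable.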

Unfortunately, the transition density,
$p(L_{W-1}(k+1) \mid L_{W-1}(k),S_0(k-1),i,\tau)$, required in (5) is
difficult to calculate, because $L_{W-1}(k)$ is not a
Markov process. In order to facilitate computation
of the probabilities, we need to approximate the
transition density. In approximating the required
transition density for $L_{W-1}(k)$ we are, in fact,
approximating the behavior of $L_{W-1}$. A simple
approximation is a Gauss-Markov process $\ell(k)$ that is defined
by

$$\ell(k+1) = A\,\ell(k) + \xi(k+1)$$

$$E\{\xi(k)\,\xi'(t)\} = BB'\,\delta(k-t)$$

where A and B are M×M constant matrices and $\xi$ is a
white Gaussian sequence with covariance equal to the
(M×M) matrix BB'. The reason for choosing this model
is twofold. Firstly, just as $L_{W-1}(k)$, $\ell(k)$ is
Gaussian. Secondly, $\ell(k)$ is Markov so that its
transition density can be readily determined. In order to
have $\ell(k)$ behave like $L_{W-1}(k)$, we set the matrices A
and B and the mean of $\xi$ such that

$$E_{i,\tau}\{\ell(k)\} = E_{i,\tau}\{L_{W-1}(k)\} \qquad (8)$$
Moreover, the matrix A is stable, i.e. the magnitudes
of all of the eigenvalues of A are less than unity,
and B is invertible if $G_0$ or $G_{W-1}$ is of rank M.
Because $\ell$ is an artificial process (i.e. $\ell$ is not a
direct function of the residuals r(k)), $\ell(k)$ can never
be implemented for use in (4).

We may choose other Markov approximations of
$L_{W-1}$ that match the n-step cross-covariance (1<n<W)
instead of matching the one-step cross-covariance as
in (10). The suitability of a criterion for choosing
the matrices A and B, such as (9) and (10), depends
directly on the failure signatures under consideration
and may be examined as an issue separate from the
decision rule design problem. Also, a higher order
Markov process may be used to approximate $L_{W-1}$.
However, the increase in the computational complexity
may negate the benefits of the approximation.

Now we can approximate the required probabilities
in the risk calculation as

$$\Pr\{ L_{W-1}(k) \in S_j,\ S_0(k-1) \mid i,\tau \} \approx \Pr\{ \ell(k) \in S_j,\ S_0(k-1) \mid i,\tau \}, \qquad j = 0,1,\ldots,M,\quad k \ge W$$

and

$$\Pr\{ \ell(k) \in S_j,\ S_0(k-1) \mid i,\tau \} = \Pr\{ S_0(k-1) \mid i,\tau \} \int_{S_j} p(\ell(k) \mid S_0(k-1),i,\tau)\, d\ell(k) \qquad (11)$$

where we have applied the same decision rule to $\ell(k)$
as to $L_{W-1}(k)$. Therefore, $S_j$ and $S_0(k-1)$ denote the
decision regions and the event of continued sampling
up to time k for both $L_{W-1}$ and $\ell$. Assuming $B^{-1}$
exists, we have

$$p(\ell(k+1) \mid S_0(k),i,\tau) = \Big[ \int_{S_0} p(\ell(k) \mid S_0(k-1),i,\tau)\, d\ell(k) \Big]^{-1}$$
$$\qquad \times \int_{S_0} p_\xi\big( \xi(k+1) = [\ell(k+1) - A\,\ell(k)] \mid i,\tau \big)\, p(\ell(k) \mid S_0(k-1),i,\tau)\, d\ell(k), \qquad k \ge W \qquad (12)$$

where $p_\xi(\xi(k) \mid i,\tau)$ is the Gaussian density of $\xi(k)$
under the failure $(i,\tau)$. Now the integrals (11) and
(12) represent more tractable numerical problems.
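
To make the structure of (11) and (12) concrete, the recursion can be sketched for the scalar case (M = 1) on a fixed grid. This is only an illustration under stated assumptions: it uses a simple rectangle rule on a uniform grid rather than the quadrature algorithm of [1], it propagates the joint density of the current statistic and the event of continued sampling (the product of the two factors in (11)), and all names and numerical values are hypothetical.

```python
import math
import numpy as np

def gaussian_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def stop_probabilities(a, xi_mean, xi_var, f, init_mean, init_var, steps,
                       lo=-12.0, hi=12.0, n=1201):
    """Scalar sketch of (11)-(12): l(k+1) = a*l(k) + xi(k+1), continue while l <= f.

    Returns the probability of stopping at each of `steps` decision times and
    the probability of still sampling after the last one.
    """
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    cont = x <= f                       # grid points in the continue region S_0
    # u(x) = joint density of l(k) = x and "sampling continued up to k".
    u = gaussian_pdf(x, init_mean, init_var)
    # kernel[y, x] = transition density p(l(k+1) = y | l(k) = x), x in S_0.
    kernel = gaussian_pdf(x[:, None], a * x[None, cont] + xi_mean, xi_var)
    stops = []
    for _ in range(steps):
        stops.append(u[~cont].sum() * dx)   # mass landing in the stop region
        u = (kernel @ u[cont]) * dx         # propagate only the continuing mass
    survive = u.sum() * dx
    return np.array(stops), survive

# Illustrative no-failure case: stationary unit-variance statistic, threshold 2.51.
stops, survive = stop_probabilities(a=0.9, xi_mean=0.0, xi_var=0.19, f=2.51,
                                    init_mean=0.0, init_var=1.0, steps=50)
```

Probability is conserved by construction (the stopping probabilities plus the surviving mass sum to one), which is a convenient check on any implementation of the recursion. For M = 2 the same structure applies with two-dimensional grids or quadrature nodes.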

In the event that B is not invertible, the
transition density is degenerate and (12) is very difficult
to evaluate. Very often this problem can be
circumvented by batch processing the residuals. That is, we
consider the modified residual sequence:

$$\bar{r}(k) = [r'(\nu k - \nu + 1), \ldots, r'(\nu k)]'$$

for some batch size $\nu$, with k as the new time index. In
using $\bar{r}(k)$ we have to augment the signatures as:
$[g_i'(0), \ldots, g_i'(\nu-1)]'$, $i = 1,\ldots,M$. By a proper choice
of $\nu$, the rank of $G_0$ can be increased to M and B will
be invertible.
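
The batching step is a pure reshaping of the residual record. A minimal sketch, assuming the residuals are stored as rows of an array and that an incomplete final batch is simply dropped (that choice, like the function name, is an assumption for illustration):

```python
import numpy as np

def batch_residuals(r, nu):
    """Stack nu consecutive residuals into one modified residual.

    r  : array of shape (K, m), rows are r(1), ..., r(K)
    nu : batch size
    Returns an array of shape (K // nu, nu * m) whose k-th row is
    [r'(nu*k - nu + 1), ..., r'(nu*k)] (1-based k).
    """
    r = np.asarray(r, dtype=float)
    n_samples, m = r.shape
    n_batches = n_samples // nu
    # Row-major reshape concatenates consecutive rows in time order.
    return r[: n_batches * nu].reshape(n_batches, nu * m)

r = np.arange(14, dtype=float).reshape(7, 2)   # seven 2-dimensional residuals
r_bar = batch_residuals(r, nu=2)               # three batches, last sample dropped
```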

Non-Window Sequential Decision Rules

Here we will describe another simple decision
rule that has the same decision regions as the
simplified sliding window rule (4), but the vector (z) of M
decision statistics is obtained differently as follows:

$$z(k+1) = A\,z(k) + B\,r(k+1) \qquad (13)$$

where A is a constant stable M×M matrix, and B is an
M×m constant matrix of rank M. Unlike the Markov
model $\ell(k)$ that approximates $L_{W-1}(k)$, z(k) is a
realizable Markov process driven by the residual. The
advantages of using z as the decision statistic are:
1) less storage is required, because residual samples
need not be stored as is necessary in the sliding window
scheme, and 2) since z is Markov, the required
probability integrals are of the form (11) and (12) so that
the same integration algorithm can be directly applied
to evaluate such integrals. (It is possible to use a
higher order z, but the added complexity will negate
the advantages.)

In order to form the statistic z, we need to
choose the matrices A and B. When the failure
signatures under consideration are constant biases, B can
simply be set to equal $G_0$, and A can be chosen to be
aI, where 0<a<1. Then, the term Br in (13) resembles
$g'V^{-1}r$ of (2), and it provides the correlation of the
residual with the signatures. The time constant
(1/(1-a)) of z characterizes the memory span of z just
as W characterizes that of the sliding window rules.
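
The recursion (13) with A = aI is a first-order low-pass filter on the correlated residual. The sketch below runs it on a noise-free constant-bias residual to show the role of the time constant 1/(1-a); the matrix B used here is a placeholder identity, not the $G_0$ of the report, and all names and values are assumptions for illustration.

```python
import numpy as np

def run_markov_statistic(residuals, A, B, z0=None):
    """Iterate z(k+1) = A z(k) + B r(k+1) and return the history of z."""
    z = np.zeros(A.shape[0]) if z0 is None else np.asarray(z0, dtype=float)
    history = []
    for r in residuals:
        z = A @ z + B @ r
        history.append(z)
    return np.array(history)

a = 0.8                                   # memory parameter, 0 < a < 1
A = a * np.eye(2)
B = np.eye(2)                             # placeholder choice of B (assumption)
bias = np.array([1.0, 0.5])
residuals = np.tile(bias, (200, 1))       # noise-free constant-bias residual
z_hist = run_markov_statistic(residuals, A, B)
steady_state = B @ bias / (1.0 - a)       # limit of z for a constant input
```

For a constant input the statistic converges geometrically to B r/(1-a), so a failure signature is amplified by the time constant while independent noise is only averaged, which is the sense in which 1/(1-a) plays the role of the window length W.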

More generally, if we consider failure signatures
that are not constant biases, the choice of A may
still be handled in the same way as in the
constant-bias case, but the selection of a B matrix is more
involved. With some insights into the nature of the
signatures, a reasonable choice of B can often be
made. To illustrate how this may be accomplished, we
will consider an example with two failure modes and an
m-dimensional residual vector. Let

$$g_1(k-\tau) = \beta_1$$

$$g_2(k-\tau) = \beta_2\,(k-\tau+1)$$

That is, $g_1$ is a constant bias, and $g_2$ is a ramp. If
$\beta_1$ and $\beta_2$ are not multiples of each other, a simple
choice of B is available:

$$B = \begin{bmatrix} \beta_1' V^{-1} \\ \beta_2' V^{-1} \end{bmatrix}$$

If $\beta_1 = \alpha_1 \beta$ and $\beta_2 = \alpha_2 \beta$, where $\alpha_1$ and $\alpha_2$ are scalar
constants, the above choice of B has rank one and is not
useful for identifying either signature. Suppose we
batch process every two residual samples together, i.e.
we use the residual sequence $\bar{r}(k) = [r'(2k-1), r'(2k)]'$,
k = 1,2,.... Then we can set B to be a matrix whose
first and second rows capture the constant-bias and
ramp nature of $g_1$ and $g_2$, respectively
(and this B has rank two). The use of the modified
residual $\bar{r}(k)$ in this case causes no adverse effect,
since it only lengthens slightly the interval between
times when terminal decisions may be made. A big
increase in such intervals, i.e. the batch processing
of $r(k),\ldots,r(k+\nu)$ simultaneously for large $\nu$, may,
however, be undesirable. For problems where the
signatures vary drastically as a function of the
elapsed time, or the distinguishability among failures
depends essentially on these variations, the
effectiveness of using z diminishes. In such cases the
sliding window decision rule should provide better
performance because of its inherent nature to look
for a full window's worth of signature.
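
The rank argument in the bias-versus-ramp example can be checked numerically. The explicit matrices are not legible in the source, so the construction below is an assumption: V is taken as the identity, the single-sample B stacks the two signature directions, and the batched B stacks the signatures augmented over two consecutive samples, with $g_2$ evaluated at elapsed times 0 and 1. The vector `beta` and the constants are illustrative values.

```python
import numpy as np

beta = np.array([1.0, 2.0])          # common direction (illustrative values)
alpha1, alpha2 = 1.0, 0.5
beta1, beta2 = alpha1 * beta, alpha2 * beta

# Single-sample choice: rows proportional to the same direction, so rank one.
B_single = np.vstack([beta1, beta2])

# Batch of two samples: augmented signatures [g(0)', g(1)'].
g1_aug = np.concatenate([beta1, beta1])            # constant bias
g2_aug = np.concatenate([1 * beta2, 2 * beta2])    # ramp: beta2 * (t + 1)
B_batch = np.vstack([g1_aug, g2_aug])

rank_single = np.linalg.matrix_rank(B_single)
rank_batch = np.linalg.matrix_rank(B_batch)
```

With the two samples stacked, the time variation of the ramp separates the rows even though the signature directions coincide, which is the effect the batch-processing argument relies on.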

Probability Calculation

An algorithm based on 1-dimensional Gaussian
quadrature formulas [9] has been developed to compute
the probability integrals of (11) and (12) for the
case M=2. (It can be extended to higher dimensions
with an increase in computation.) The details of this
quadrature algorithm are described in [1]. Its
accuracy has been assessed via comparison with Monte Carlo
simulations (see the numerical example). With this
algorithm we can evaluate the performance probabilities
and risks associated with the suboptimal decision
rules described above.
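
As an illustration of how 1-dimensional quadrature formulas can be combined for M = 2, the sketch below integrates a bivariate Gaussian density over a rectangle with a tensor product of Gauss-Legendre rules. It is not the algorithm of [1], whose details are not given here; the rectangular region, the node count of 20, and the function names are assumptions.

```python
import math
import numpy as np
from numpy.polynomial.legendre import leggauss

def gauss_legendre_2d(func, x_lim, y_lim, n=20):
    """Integrate func(x, y) over a rectangle with an n-by-n Gauss-Legendre rule."""
    nodes, weights = leggauss(n)              # nodes and weights on [-1, 1]
    (ax, bx), (ay, by) = x_lim, y_lim
    xs = 0.5 * (bx - ax) * nodes + 0.5 * (bx + ax)
    ys = 0.5 * (by - ay) * nodes + 0.5 * (by + ay)
    wx = 0.5 * (bx - ax) * weights
    wy = 0.5 * (by - ay) * weights
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    return float(wx @ func(X, Y) @ wy)

def bivariate_normal_pdf(x, y, mean, cov):
    cov = np.asarray(cov, dtype=float)
    inv = np.linalg.inv(cov)
    det = np.linalg.det(cov)
    dx, dy = x - mean[0], y - mean[1]
    quad = inv[0, 0] * dx * dx + 2.0 * inv[0, 1] * dx * dy + inv[1, 1] * dy * dy
    return np.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# Probability that an uncorrelated unit-variance pair falls in a rectangle.
prob = gauss_legendre_2d(
    lambda x, y: bivariate_normal_pdf(x, y, (0.0, 0.0), np.eye(2)),
    x_lim=(-1.0, 2.0), y_lim=(-0.5, 1.5))
```

For uncorrelated components the result factors into a product of one-dimensional normal probabilities, which provides an independent check on the quadrature.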


Risk Calculation

In the absence of a failure, the conditional
density has been observed to essentially reach a
steady state at some finite time T > W.¹ Then, for $k \ge T$
we have

$$\Pr\{ \ell(k) \in S_j \mid S_0(k-1),\ 0,- \} = b_j \qquad (14)$$

$$\Pr\{ \ell(k) \in S_j,\ \ell(k-1) \in S_0, \ldots, \ell(\tau) \in S_0 \mid S_0(\tau-1),\ i,\tau \} = b_j(k-\tau \mid i), \qquad k \ge \tau > T \qquad (15)$$

That is, once steady state is reached, only the
relative time (elapsed time) is important. Generally,
failures occur infrequently, and decision rules with
low false alarm probabilities are employed. Thus, it
is reasonable to assume 1) $\rho \ll 1$ ($(1-\rho)^T \approx 1$), and 2)
$\Pr\{S_0(T) \mid 0,-\} \approx 1$. The sequential risk associated
with (4) for M=2 can be approximated by

$$U^W(f) \approx P_F L_F + (1-P_F) \sum_{i=1}^{2} \sigma(i) \sum_{t=0}^{\infty} \sum_{j=1}^{2} [c(i)\,t + L(i,j)]\, b_j(t \mid i) \qquad (16)$$

where

$$P_F = \frac{(1-\rho)(1-b_0)}{1 - b_0 (1-\rho)}$$
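
The closed form for the unconditional false alarm probability can be checked against a direct sum over the failure time. The sketch assumes the expression $P_F = (1-\rho)(1-b_0)/(1-b_0(1-\rho))$, a geometric first-failure time with parameter $\rho$, and a constant per-step continuation probability $b_0$ from the first step onward (that is, the initial transient before steady state is ignored); names and numerical values are illustrative.

```python
def false_alarm_probability(rho, b0):
    """Closed-form probability of a false alarm before the first failure."""
    return (1.0 - rho) * (1.0 - b0) / (1.0 - b0 * (1.0 - rho))

def false_alarm_probability_series(rho, b0, tau_max=5000):
    """Same quantity by summing over the failure time tau.

    A failure at tau leaves tau - 1 failure-free steps, each survived
    without an alarm with probability b0.
    """
    total = 0.0
    for tau in range(1, tau_max + 1):
        prior = rho * (1.0 - rho) ** (tau - 1)
        total += prior * (1.0 - b0 ** (tau - 1))
    return total

p_closed = false_alarm_probability(rho=0.01, b0=0.999)
p_series = false_alarm_probability_series(rho=0.01, b0=0.999)
```

With these illustrative numbers the mean time between false alarms is 1/(1-b_0) = 1000 steps and the mean time to failure is 1/ρ = 100 steps, so roughly nine percent of runs end in a false alarm before the failure occurs.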

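Next, we seek to replace the infinite sum over t
in (16) by the finite sum up to $t = \Delta$ plus a term
approximating the remainder of the infinite sum.
Suppose we have been sampling for $\Delta$ steps since the
failure occurred. Define:

$$P_\Delta(j \mid i) = \Pr\{ \ell(\Delta) \in S_j \mid S_0(\Delta-1),\ i,0 \}, \qquad j = 0,1,2$$

If we stop computing the probabilities after $\Delta$, we
may approximate

$$b_j(t \mid i) \approx b_0(\Delta \mid i)\, [P_\Delta(0 \mid i)]^{\,t-\Delta-1}\, P_\Delta(j \mid i), \qquad t > \Delta \qquad (17)$$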

¹ Unfortunately, we have not been able to prove
such convergence behavior using elementary techniques.
More advanced function-theoretic methods may be
necessary.


When the signature of the failure mode is a constant
(including the no-fail case), the reasoning behind
(14) holds, and we can see that $P_t(j \mid i)$ will reach a
steady state value as t (the elapsed time) increases.
Then, (17) is a valid approximation for a large $\Delta$.
For the case where failure signatures are not constants,
the probability of continuing after $\Delta$ time steps (for
sufficiently large $\Delta$) may be arbitrarily small. The
error introduced by (17) in the risk (and performance
probability) calculation is, consequently, small.
Substituting (17) in (16), we get

$$U^W(f) \approx P_F L_F + (1-P_F) \sum_{i=1}^{2} \sigma(i) \Big[ c(i)\,\bar{t}_i + \sum_{j=1}^{2} L(i,j)\, P(i,j) \Big] \qquad (18)$$

where

$$\bar{t}_i = \sum_{j=1}^{2} \sum_{t=0}^{\Delta} t\, b_j(t \mid i) + b_0(\Delta \mid i) \Big[ \Delta + \frac{1}{1 - P_\Delta(0 \mid i)} \Big] \qquad (19)$$

$$P(i,j) = \sum_{t=0}^{\Delta} b_j(t \mid i) + b_0(\Delta \mid i)\, \frac{P_\Delta(j \mid i)}{1 - P_\Delta(0 \mid i)} \qquad (20)$$


$P_F$ is the unconditional false alarm probability, i.e.
the probability of one false alarm over all time, $\bar{t}_i$
is the conditional expected delay to decision, given
that a type i failure has occurred, and P(i,j) is the
conditional probability of declaring a type j failure,
given that failure i has occurred. From the assumption
that $\Pr\{S_0(T) \mid 0,-\} \approx 1$ and the steady condition (14), it
can be shown that the mean time between false alarms is
simply $(1-b_0)^{-1}$. Now all the probabilities in (18)-(20)
can be computed by using the quadrature algorithm.
Note that the risk expression (18) consists only of
finite sums and it can be evaluated with a reasonable
amount of computational effort. With such an
approximation of the sequential risk, we will be able to
consider the problem of determining the decision
regions (the thresholds f) that minimize the risk.
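
Once the probabilities are tabulated, the finite-sum risk is inexpensive to evaluate. The sketch below assumes the forms of (18)-(20) with a geometric tail after $\Delta$ steps; the array layout, the function names, and the synthetic test case (constant per-step probabilities, for which the exact answers are known in closed form) are assumptions made for illustration.

```python
import numpy as np

def approximate_risk(b, P_delta, c, L, sigma, P_F, L_F):
    """Evaluate the approximate sequential risk for M = 2.

    b       : array (2, 3, Delta+1); b[i, j, t] = b_j(t | i), j = 0 is "continue"
    P_delta : array (2, 3); P_delta[i, j] = P_Delta(j | i)
    c, sigma: arrays (2,); L: array (2, 2) of decision costs L(i, j)
    """
    b, P_delta = np.asarray(b, float), np.asarray(P_delta, float)
    c, sigma, L = np.asarray(c, float), np.asarray(sigma, float), np.asarray(L, float)
    n_fail, _, n_t = b.shape
    delta = n_t - 1
    t = np.arange(n_t)
    t_bar = np.zeros(n_fail)
    P = np.zeros((n_fail, n_fail))
    risk = P_F * L_F
    for i in range(n_fail):
        tail = b[i, 0, delta]                 # still sampling after Delta steps
        gain = 1.0 / (1.0 - P_delta[i, 0])    # geometric-tail factor
        t_bar[i] = (t * b[i, 1:, :]).sum() + tail * (delta + gain)        # (19)
        P[i] = b[i, 1:, :].sum(axis=1) + tail * P_delta[i, 1:] * gain     # (20)
        risk += (1.0 - P_F) * sigma[i] * (c[i] * t_bar[i] + L[i] @ P[i])  # (18)
    return risk, t_bar, P

def geometric_b(p, delta):
    """Synthetic b_j(t | i) for constant per-step probabilities p = (p0, p1, p2)."""
    t = np.arange(delta + 1)
    return np.array([p[0] ** (t + 1), p[0] ** t * p[1], p[0] ** t * p[2]])

P_delta = np.array([[0.6, 0.3, 0.1], [0.5, 0.1, 0.4]])
b = np.stack([geometric_b(P_delta[0], 5), geometric_b(P_delta[1], 5)])
risk, t_bar, P = approximate_risk(b, P_delta, c=[1.0, 1.0],
                                  L=[[0.0, 10.0], [10.0, 0.0]],
                                  sigma=[0.5, 0.5], P_F=0.1, L_F=5.0)
```

In the synthetic case the tail approximation is exact, so the expected delay equals p0/(1-p0) and the declaration probabilities equal p_j/(1-p0) regardless of the truncation point, which makes it a convenient regression test.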

It should be noted that we could consider choosing
a set of thresholds that minimize a weighted
combination of certain detection probabilities (P(i,j)), the
expected detection delay ($\bar{t}_i$), and the mean time
between false alarms ($(1-b_0)^{-1}$). Although such an
objective function will not result in a Bayesian
design in general, it is a valid design criterion that
may be useful for some applications.


Risk Minimization

The risk minimization problem has two features
that deserve special attention. Firstly, the
sequential risk is not a simple function of the threshold f,
and the derivative with respect to f is not readily
available. Secondly, calculating the risk is a costly
task. Therefore, the minimum-seeking procedure to be
used must require few function (risk) evaluations, and
it must not require derivatives. The
sequence-of-quadratic-programs (SQP) algorithm studied by Winfield
[10] has been chosen to solve this problem, because it
does not need any derivative information and it appears
to require fewer function evaluations than other
well-known algorithms [10]. Furthermore, the SQP is simple,
and it has quadratic convergence. Very briefly, the
algorithm consists of the following. At each iteration,
a quadratic surface is fitted to the risk function
locally, then the quadratic model is minimized over a
constraint region (hence the name SQP). The risk
function is evaluated at this minimum and is used in
the surface fitting of the next iteration. The
details of the application of SQP to risk minimization

Here, we will discuss an application of the
suboptimal rule design methodology described above to a
numerical example. We will consider the detection
and identification of two possible failure modes
(without identifying the failure times). We assume
that the residual is a 2-dimensional vector, and the
vector failure signatures, $g_i(t)$, i=1,2, as functions
of the elapsed time t are shown in Table 1. The
signature of the first failure mode is simply a
constant vector. The first component of $g_2(t)$ is a
constant, while the second component is a ramp. We have
chosen to examine these two types of signature
behavior (constant bias and ramp) because they are
simple and describe a large variety of failure signatures
that are commonly seen in practice. For simplicity,
we have chosen V, the covariance of r, to be the
identity matrix.

We will design both a simplified sliding window
rule (that uses $L_{W-1}$) and a rule using the Markov
statistic z. The parameters associated with
$L_{W-1}$, $\ell$, and z are shown in Table 2, and the cost
functions and the prior probabilities are shown in
Table 3. To facilitate discussions, we will
introduce the following terminology. We will refer to a
Monte Carlo simulation of the sliding window rule by
SW, a simulation of the rule using the Markov
statistic z as Markov implementation (MI), and a simulation
of the nonimplementable decision process using the
approximation $\ell$ as Markov approximation (MA). (All
simulations are based on 10,000 trajectories.) The
notation Q20 refers to the results of applying the
quadrature algorithm to the approximation of $L_{W-1}$ by $\ell$.

Table 1. Failure signatures.

Table 2. Parameters for $L_{W-1}$, $\ell$ and z.

Table 3. Cost Functions and Prior Probability.

    L(1,2) = L(2,1) = 10
    L(1,1) = L(2,2) = 0

The results of SW, MA, and Q20 for the thresholds
[3.35, 12.05] are shown in Figs. 2-6 (see (15) for the
definition of notations). The quadrature results Q20
are very close to MA, indicating good accuracy of the
quadrature algorithm. In comparing SW with MA, it is
evident that the Markov approximation (MA) slightly
under-estimates the false alarm rate of the sliding
window rule (SW). However, the response of the Markov
approximation to failures is very close to that of the
sliding window rule. In the present example, L_{W-1} is
a 7-th order process, while its approximation ℓ is
only of first order. In view of this fact, we can
conclude that ℓ provides a very reasonable and useful
approximation of L_{W-1}.
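
A Monte Carlo estimate of a false alarm rate, of the kind compared above, can be sketched as follows. This is only an illustration under stated assumptions: the statistic here is a plain correlation of the last few residuals with a signature template (with V = I), which stands in for, and is not claimed to be, the paper's definition of the sliding window statistic. The names `window_statistic` and `false_alarm_rate`, and all numerical values, are hypothetical.

```python
import random

def window_statistic(window, template):
    """Correlate a window of residuals (oldest first) with a signature
    template indexed by elapsed time. Assumed form, with V = I."""
    return sum(g[0] * r[0] + g[1] * r[1] for g, r in zip(template, window))

def false_alarm_rate(template, threshold, horizon, n_runs, seed=0):
    """Fraction of failure-free runs in which the statistic exceeds the
    threshold at least once within `horizon` steps."""
    rng = random.Random(seed)
    width = len(template)
    alarms = 0
    for _ in range(n_runs):
        window = []
        for _ in range(horizon):
            # Failure-free residual: zero mean, identity covariance.
            window.append((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))
            if len(window) > width:
                window.pop(0)
            if len(window) == width and window_statistic(window, template) > threshold:
                alarms += 1
                break
    return alarms / n_runs

# Placeholder template: a constant signature over a 4-step window.
template = [(1.0, 1.0)] * 4
rate = false_alarm_rate(template, threshold=8.0, horizon=20, n_runs=500)
```

Increasing `n_runs` tightens the estimate; the text above uses 10,000 trajectories per simulation.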

The successive choices of thresholds by SQP for
the sliding window rule are plotted in Fig. 7. Note
that we have not carried the SQP algorithm far enough
so that the successive choices of thresholds are, say,
within .001 of each other. This is because towards
later iterations the performance indices become relatively
insensitive to small changes of the f's. This
together with the fact that we are only computing an
approximate Bayes risk means that fine scale optimization
is not worthwhile. Therefore, with the approximate
risk, the SQP is cost efficiently used to locate
the zone where the minimum lies. That is, the SQP
algorithm is to be terminated when it is evident that
it has converged into a reasonably small region. Then
we may choose the thresholds that give the smallest
risk as the approximate solution of the minimization.
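
The stopping strategy described above can be sketched in a few lines. This is a hedged illustration, not the SQP algorithm itself: it takes a sequence of threshold choices already produced by some optimizer, stops once two consecutive choices fall within a coarse tolerance, and returns the visited choice with the smallest approximate risk. The function name `best_of_coarse_search`, the tolerance `region_tol`, and the toy risk function are all assumptions made for the example.

```python
def best_of_coarse_search(iterates, risk, region_tol=0.1):
    """Scan successive threshold choices; stop once two consecutive choices
    differ by less than region_tol in every component, then return the
    visited choice with the smallest (approximate) risk."""
    best, best_risk, previous = None, float("inf"), None
    for thresholds in iterates:
        value = risk(thresholds)
        if value < best_risk:
            best, best_risk = thresholds, value
        if previous is not None and all(
            abs(a - b) < region_tol for a, b in zip(thresholds, previous)
        ):
            break  # converged into a reasonably small region
        previous = thresholds
    return best, best_risk

# Toy stand-in for the approximate Bayes risk (made-up quadratic).
def toy_risk(f):
    return (f[0] - 1.9) ** 2 + (f[1] - 2.1) ** 2

# Made-up sequence of threshold pairs, as an optimizer might produce.
steps = [(0.0, 0.0), (1.0, 1.5), (1.9, 2.1), (1.95, 2.05), (2.0, 2.0)]
choice, value = best_of_coarse_search(steps, toy_risk)
```

Note that the returned pair need not be the last iterate: with an approximate risk, an earlier choice may have the smallest computed value.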

In the event that thresholds that yield the smallest
risk do not provide the desired detection performance,
the design parameters, L, c, u, and W may be
adjusted and the SQP may be repeated to get a new design.
A practical alternative method is to make use
of the list of performance indices (e.g. P(i,j)) that
are generated in the risk calculation, and choose a
pair of thresholds that yields the desired performance.
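
The alternative just described amounts to a lookup over the performance indices recorded during the risk calculations. A minimal sketch, assuming each evaluated threshold pair was stored with its false alarm probability, correct detection probability, and risk; the record fields, the function name `select_thresholds`, and every number below are invented for illustration only.

```python
def select_thresholds(records, max_false_alarm, min_detection):
    """Return the lowest-risk record meeting both performance requirements,
    or None if no evaluated threshold pair qualifies."""
    feasible = [
        rec for rec in records
        if rec["false_alarm"] <= max_false_alarm
        and rec["detection"] >= min_detection
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda rec: rec["risk"])

# Made-up performance list of the kind an optimization run would leave behind.
history = [
    {"thresholds": (1.0, 2.0), "false_alarm": 0.20, "detection": 0.95, "risk": 1.4},
    {"thresholds": (1.5, 2.5), "false_alarm": 0.08, "detection": 0.90, "risk": 1.1},
    {"thresholds": (2.0, 3.0), "false_alarm": 0.03, "detection": 0.85, "risk": 1.2},
]
picked = select_thresholds(history, max_false_alarm=0.10, min_detection=0.80)
```

Because the list is a by-product of the optimization, this selection needs no further risk evaluations.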

The performance of the decision rules using L_{W-1}
and z as determined by SQP is shown in Figs. 8-12.
(The thresholds for L_{W-1} are [3.35, 12.05] and those
for z are [6.29, 11.69].) We note that MI has a
higher false alarm rate than SW. The speed of detection
for the two rules is similar. While MI has a
slightly higher type-1 correct detection probability
than SW, SW has a consistently higher b_2(t|2) (type-2
correct detection probability) than MI. By raising
the thresholds of the rule using z appropriately, we
can decrease the false alarm rate of MI down to that
of SW with an increase in detection delay and slightly
improved correct detection probability for the type-2
failure (with ramp signature). Thus, the sliding
window rule is slightly superior to the rule using z
in the sense that when both are designed to yield a
comparable false alarm rate, the latter will have
longer detection delays and slightly lower correct
detection probability (for type-2 failure). In view
of the fact that a decision rule using z is much
simpler to implement, it is worthy of being considered
as an alternative to the sliding window rule.


In summary, the result of applying our decision
rule design method to the present example is very
good. The quadrature algorithm has been shown to be
useful, and the Markov approximation of L_{W-1} by ℓ is
a valid one. The SQP algorithm has demonstrated its
simplicity and usefulness through the numerical example.
Finally, the Markov decision statistic z has
been shown to be a worthy alternative to the sliding
window statistic L_{W-1}.

5.  CONCLUSION 

A methodology based on the Bayesian approach is
developed for designing suboptimal sequential decision
rules. This methodology is applied to a numerical
example, and the results indicate that it is
a useful design approach.

Fig. 1 Sequential Decision Regions in 2 Dimensions